Evaluation of Decision Tree Pruning with Subadditive Penalties
Authors
Abstract
Recent work on decision tree pruning [1] has drawn the machine learning community's attention to the fact that, in classification problems, the use of subadditive penalties in cost-complexity pruning has a stronger theoretical basis than the usual additive penalty terms. We implement cost-complexity pruning algorithms with general size-dependent penalties to confirm the results of [1]: namely, that the family of pruned subtrees selected by pruning with a subadditive penalty of increasing strength is a subset of the family selected using additive penalties. Consequently, this family of pruned trees is unique, nested, and efficiently computable. However, despite the better theoretical grounding of cost-complexity pruning with subadditive penalties, we found no systematic improvement in the generalization performance of the final classification tree selected by cross-validation when subadditive penalties were used instead of the commonly used additive ones.
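To make the selection criterion concrete, the following is a minimal, illustrative sketch (not the authors' implementation) of cost-complexity pruning with a general size-dependent penalty phi applied to the number of leaves. It enumerates all pruned subtrees of a hypothetical toy tree by brute force and picks the one minimizing training error plus alpha * phi(#leaves); the tree encoding and the sqrt penalty are assumptions for illustration only.

```python
import math
from itertools import product

# Each node is a tuple (err, left, right): err is the number of training
# examples the node would misclassify if collapsed into a leaf; left and
# right are None for leaves of the original tree.

def enumerate_pruned(node):
    """Yield every pruned subtree rooted at this node (root always kept)."""
    err, left, right = node
    yield (err, None, None)               # option 1: collapse node into a leaf
    if left is not None:
        for l, r in product(enumerate_pruned(left), enumerate_pruned(right)):
            yield (err, l, r)             # option 2: keep the split

def n_leaves(node):
    err, left, right = node
    return 1 if left is None else n_leaves(left) + n_leaves(right)

def train_error(node):
    err, left, right = node
    return err if left is None else train_error(left) + train_error(right)

def best_pruned(tree, alpha, phi):
    """Pruned subtree minimizing  train_error + alpha * phi(n_leaves)."""
    return min(enumerate_pruned(tree),
               key=lambda t: train_error(t) + alpha * phi(n_leaves(t)))

# Hypothetical toy tree: the root would misclassify 10 points as a leaf,
# its right child 5; the three original leaves misclassify 2, 1, and 0.
tree = (10, (2, None, None), (5, (1, None, None), (0, None, None)))

additive    = lambda n: n              # the usual additive penalty term
subadditive = lambda n: math.sqrt(n)   # one simple subadditive choice
```

Sweeping alpha over a grid and recording `best_pruned(tree, alpha, phi)` for each value traces out the family of pruned subtrees a given penalty selects; with `alpha = 0` the full tree (3 leaves, error 3) wins, while a large alpha collapses everything to the root leaf under either penalty. This exhaustive enumeration is exponential in tree size and is meant only to clarify the objective, not to replace the efficient pruning algorithms discussed in the paper.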
Similar Papers
Evaluation of liquefaction potential based on CPT results using C4.5 decision tree
The prediction of the liquefaction potential of soil due to an earthquake is an essential task in civil engineering. A decision tree is a tree structure consisting of internal and terminal nodes that process the data to ultimately yield a classification. C4.5 is a well-known algorithm widely used to design decision trees. In this algorithm, a pruning process is carried out to solve the problem of the...
Anomaly Detection Using SVM as Classifier and Decision Tree for Optimizing Feature Vectors
Abstract- With the advancement and development of computer network technologies, the way for intruders has become smoother; therefore, to detect threats and attacks, the importance of intrusion detection systems (IDS) as one of the key elements of security is increasing. One of the challenges of intrusion detection systems is managing the large volume of network traffic features. Removing un...
Pruning Decision Trees with Misclassification Costs (appears in ECML-98 as a research note; a longer version is available as ECE TR 98-3, Purdue University)
We describe an experimental study of pruning methods for decision tree classifiers when the goal is minimizing loss rather than error. In addition to two common methods for error minimization, CART's cost-complexity pruning and C4.5's error-based pruning, we study the extension of cost-complexity pruning to loss and one pruning variant based on the Laplace correction. We perform an empirical com...
New algorithms for learning and pruning oblique decision trees
In this paper, we present methods for learning and pruning oblique decision trees. We propose a new function for evaluating different split rules at each node while growing the decision tree. Unlike the other evaluation functions currently used in the literature (which are all based on some notion of purity of a node), this new evaluation function is based on the concept of degree of linear separab...
Toward a Theoretical Understanding of Why and When Decision Tree Pruning Algorithms Fail (keywords: decision-tree learning, theory of model selection and evaluation)
Recent empirical studies revealed two surprising pathologies of several common decision tree pruning algorithms. First, tree size is often a linear function of training set size, even when additional tree structure yields no increase in accuracy. Second, building trees with data in which the class label and the attributes are independent often results in large trees. In both cases, the pruning ...